Simulation Study of the Effectiveness of Masking Microdata with Mixtures of Multivariate Normal Distributions
Abstract
Continuous variables in microdata can be masked to protect against disclosure by adding noise. I consider adding noise that is distributed according to a mixture of normal distributions. Constructing this additive noise involves several parameters, and the purpose of this study is to provide guidance on choosing them. The effectiveness of the masking method is measured by the proportion of records that can be re-identified using Winkler's matching software. Results depend heavily on the matching software used.
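The masking idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the study's actual setup: the parameter names (`weights`, `means`, `noise_scale`), the choice of a component covariance proportional to the sample covariance, the centring of the component means so the noise has overall mean zero, and in particular the nearest-neighbour matcher, which is a simple stand-in and not Winkler's matching software.

```python
import numpy as np


def mask_with_mixture_noise(X, weights, means, noise_scale, rng):
    """Add noise drawn from a mixture of multivariate normals to each row of X.

    Hypothetical parameters (not the paper's notation):
      weights     -- mixing proportions, must sum to 1
      means       -- one mean vector per mixture component, shape (k, p)
      noise_scale -- multiplier applied to the sample covariance of X
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    # Centre the component means so the mixture has overall mean zero.
    means = means - weights @ means
    # Assumed component covariance: a scaled copy of the sample covariance.
    cov = noise_scale * np.atleast_2d(np.cov(X, rowvar=False))
    # Pick a component for every record, then draw noise around its mean.
    comp = rng.choice(len(weights), size=n, p=weights)
    noise = means[comp] + rng.multivariate_normal(np.zeros(p), cov, size=n)
    return X + noise


def nearest_neighbour_match_rate(X, X_masked):
    """Share of masked records whose closest original record is their own source.

    A crude stand-in for record-linkage software, used only for illustration.
    """
    # Squared Euclidean distance from every masked record to every original.
    d = ((X_masked[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return float(np.mean(d.argmin(axis=1) == np.arange(len(X))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))  # synthetic "microdata"
    X_masked = mask_with_mixture_noise(
        X,
        weights=[0.5, 0.5],
        means=[[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]],
        noise_scale=0.1,
        rng=rng,
    )
    print(nearest_neighbour_match_rate(X, X_masked))
```

Varying `weights`, `means`, and `noise_scale` and watching how the match rate changes gives a rough feel for the kind of parameter study the abstract describes, though a different matcher may rank the settings differently.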
Similar resources
Comparing Mean Vectors Via Generalized Inference in Multivariate Log-Normal Distributions
In this paper, we consider the problem of comparing means in several multivariate log-normal distributions and propose a useful method called the generalized variable method. Simulation studies show that the suggested method has appropriate size and power regardless of sample size. To evaluate this method, we compare it with traditional MANOVA such that the actual sizes of the two methods ...
WP. 11 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS EUROPEAN COMMISSION STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT)
Statistically defensible methods for disclosure limitation allow data users to make inferences about parameters in a model similar to those that would be possible using the original unreleased data. We present a new perturbation method for protecting confidential attributes in continuous microdata, Random Orthogonal Matrix Masking (ROMM), which preserves the sufficient statistics for multivariate...
Reverse Mapping to Preserve the Marginal Distributions of Attributes in Masked Microdata
In this paper we describe a new procedure that is capable of ensuring that the marginal distributions of attributes in microdata masked with a masking mechanism end up being the same as the marginal distributions of attributes in the original data. We illustrate the application of the new procedure using several commonly used masking mechanisms.
The Analysis of Bayesian Probit Regression of Binary and Polychotomous Response Data
The goal of this study is to introduce a statistical method that analyzes specific latent data for the regression analysis of discrete data, and to relate the probit regression model (for the discrete response) to the normal linear regression model (for the continuous latent response). This method provides precise inferences on binary and multinomia...
On the non-parametric multivariate control charts in fuzzy environment
Multivariate control charts are generally used in situations where the simultaneous monitoring or control of two or more related quality characteristics is necessary. In most real-world processes, the distributions of the process characteristics are unknown or at least non-normal, so non-parametric or distribution-free charts are desirable. Most non-parametric statistical process-control t...